77 research outputs found

    Financial time series representation using multiresolution important point retrieval method

    Financial time series analysis is usually conducted by determining the important points of the series. These important points, the peaks and the dips, indicate the effect of significant factors or events, both internal and external. The peaks and dips of a series may appear frequently at multiple resolutions over time. To make financial time series tractable, researchers usually reduce this complexity, so transforming the series into a more easily understood representation is an appropriate approach. In this paper, we propose a multiresolution important point retrieval method for financial time series representation. The idea of the method is to find the most important points at each resolution and record them. The collected important points are used to construct a TS-binary search tree, from which time series segmentation is performed. The experimental results show that the TS-binary search tree representation performs differently for different numbers of cutting points; empirically, using more than 12 cutting points gives better results.
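The important-point retrieval described above is close in spirit to the perceptually important point (PIP) scheme; the sketch below is an illustrative assumption, not necessarily the paper's exact measure: importance is taken as vertical distance from the line joining the nearest already-selected points.

```python
def vertical_distance(series, i, left, right):
    # Vertical distance from point i to the straight line joining
    # (left, series[left]) and (right, series[right]).
    slope = (series[right] - series[left]) / (right - left)
    return abs(series[i] - (series[left] + slope * (i - left)))

def important_points(series, n_points):
    # Greedily select the n_points most important indices: start with the
    # endpoints, then repeatedly add the point farthest (vertically) from
    # the line joining its nearest already-selected neighbours.
    selected = [0, len(series) - 1]
    while len(selected) < n_points:
        best_i, best_d = None, -1.0
        for i in range(1, len(series) - 1):
            if i in selected:
                continue
            left = max(s for s in selected if s < i)
            right = min(s for s in selected if s > i)
            d = vertical_distance(series, i, left, right)
            if d > best_d:
                best_i, best_d = i, d
        selected.append(best_i)
        selected.sort()
    return selected

# The sharp peak at index 6 is retrieved first after the endpoints.
print(important_points([0, 1, 5, 1, 0, 2, 8, 2, 0], 3))  # [0, 6, 8]
```

Repeating the selection at successively coarser smoothings of the series would yield the per-resolution point sets the method records.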

    Unsupervised Anomaly Detection with Unlabeled Data Using Clustering

    Intrusions pose a serious security risk in a network environment. New intrusion types, of which detection systems are unaware, are the most difficult to detect. The amount of available network audit data is usually large, and human labeling is tedious, time-consuming, and expensive. Traditional anomaly detection algorithms require a set of purely normal data on which to train their models. We present a clustering-based intrusion detection algorithm, unsupervised anomaly detection, which trains on unlabeled data in order to detect new intrusions. Our method detects many different types of intrusions while maintaining a low false positive rate, as verified on the KDD Cup 1999 (Knowledge Discovery and Data Mining) dataset.
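One common way to realize this idea (a minimal sketch, not necessarily the authors' exact procedure) is fixed-width clustering: each point joins the first cluster whose centre lies within a fixed width, and under the assumption that intrusions are rare, the smallest clusters are flagged as anomalous.

```python
import math

def fixed_width_clusters(points, width):
    # Single-pass clustering: assign each point to the first cluster
    # whose centre lies within `width`; otherwise it seeds a new cluster.
    centres, members = [], []
    for p in points:
        for k, c in enumerate(centres):
            if math.dist(p, c) <= width:
                members[k].append(p)
                break
        else:
            centres.append(p)
            members.append([p])
    return members

def flag_anomalies(points, width, min_size):
    # Clusters smaller than min_size are labelled anomalous, assuming
    # attacks form a small minority of the unlabeled audit data.
    anomalous = []
    for cluster in fixed_width_clusters(points, width):
        if len(cluster) < min_size:
            anomalous.extend(cluster)
    return anomalous

traffic = [(0, 0), (0.5, 0), (0, 0.5), (0.3, 0.3), (10, 10)]
print(flag_anomalies(traffic, 2.0, 2))  # [(10, 10)]
```

In practice features would be normalized first so that the single width threshold is meaningful across dimensions.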

    Fuzzy C-Mean And Genetic Algorithms Based Scheduling For Independent Jobs In Computational Grid

    Grid computing is becoming one of the most important research areas in high performance computing. Job scheduling in Grid computing is complicated: it must discover a diversity of available resources, select appropriate applications, and map them to suitable resources. The central problem is optimal job scheduling, in which Grid nodes must allocate appropriate resources to each job. In this paper, we combine two popular algorithms, Fuzzy C-Means and Genetic Algorithms, for Grid scheduling. Our model classifies jobs using the Fuzzy C-Means algorithm and maps the jobs to appropriate resources using a Genetic Algorithm. In our experiments, we fed historical workload information into our simulator and obtained better results than traditional scheduling policies. Finally, the paper discusses the job classification approach and the optimization engine in Grid scheduling.
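The Genetic Algorithm mapping stage can be sketched as follows. This is a minimal illustration under assumed choices (chromosome = job-to-node assignment, fitness = makespan on nodes with given speeds); the paper's actual encoding and operators may differ.

```python
import random

def makespan(assignment, job_lengths, node_speeds):
    # Total completion time of the most loaded node.
    loads = [0.0] * len(node_speeds)
    for job, node in enumerate(assignment):
        loads[node] += job_lengths[job] / node_speeds[node]
    return max(loads)

def ga_schedule(job_lengths, node_speeds, pop_size=30, generations=100, seed=0):
    rng = random.Random(seed)
    n_jobs, n_nodes = len(job_lengths), len(node_speeds)
    # Random initial population of job-to-node assignments.
    pop = [[rng.randrange(n_nodes) for _ in range(n_jobs)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: makespan(a, job_lengths, node_speeds))
        survivors = pop[:pop_size // 2]           # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_jobs)        # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:                # mutation: move one job
                child[rng.randrange(n_jobs)] = rng.randrange(n_nodes)
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda a: makespan(a, job_lengths, node_speeds))
```

In the combined pipeline, Fuzzy C-Means would first group jobs by their resource requirements; a GA of this shape would then search for a low-makespan mapping within each group.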

    Development of intelligent hybrid learning system using clustering and knowledge-based neural networks for economic forecasting : First phase

    The economic forecasting environment is currently undergoing drastic changes and poses a complex and challenging task. In practice, people design a database application or use a statistical package to analyse the data. The former approach works on online data, but it must be developed after stating the goal of the analysis, which means it is only possible for a limited and specific purpose. The statistical approach must be applied to offline data, and it can miss patterns and leave knowledge in the available data undiscovered (Shan, C., 1998). The effort to extract implicit, previously unknown, hidden, and potentially useful information from raw data in an automatic fashion leads us to data mining techniques, which have recently received great attention from researchers. This paper discusses joint clustering and knowledge-based neural network techniques as an application for point forecast decision making. Future predictions (e.g., political conditions, corporation factors, macroeconomic factors, and the psychological factors of investors) play an important role in a stock exchange, so our prediction model should be able to predict results more precisely. We propose a K-Means clustering algorithm based on multidimensional scaling, joined with a knowledge-based neural network algorithm, to support the learning module in generating interesting clusters, which in turn generate interesting rules for extracting knowledge from stock exchange databases efficiently and accurately.
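The K-Means stage of such a pipeline can be sketched as plain Lloyd's iteration; this is a generic illustration, not the paper's MDS-based variant.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    # Lloyd's algorithm: alternate assigning points to the nearest
    # centre and recomputing each centre as the mean of its members.
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centres[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centre if a cluster empties out
                centres[j] = tuple(sum(c) / len(g) for c in zip(*g))
    return centres, groups
```

The resulting clusters would then be passed to the knowledge-based neural network module as the learning input described above.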

    An initial state of design and development of intelligent knowledge discovery system for stock exchange database

    Data mining has been a challenging research field for the last few years, and researchers use many different techniques. This paper discusses the initial state of the design and development of an Intelligent Knowledge Discovery System for Stock Exchange (SE) databases. We divide the problem into two modules. In the first module, we define a fuzzy rule base system to handle vague information in stock exchange databases. After normalizing the massive amount of data, we apply our proposed approach, mining frequent patterns with neural networks. Future predictions (e.g., political conditions, corporation factors, macroeconomic factors, and the psychological factors of investors) play an important role in a stock exchange, so our prediction model should be able to predict results more precisely. In the second module, we generate a clustering algorithm consisting of two steps, training and running. The training step generates the neural network knowledge base for clustering. In the running step, the neural network knowledge base supports the module in generating learned complete data, transformed data, and interesting clusters that help to generate interesting rules.

    Clustering Weather Data Using the Agglomerative Hierarchical Clustering Technique for Rainfall Forecasting

    This paper reports the use of the agglomerative hierarchical clustering technique for rainfall forecasting. The main aim of the study is to examine the effectiveness and performance of the algorithms within the hierarchical clustering technique. The paper begins with a description of hierarchical clustering, focusing on the Single Link, Average Link, and Complete Link algorithms. Using these algorithms, clusters are produced by building a sequence of clustering schemes in which the number of clusters is reduced at each step; each new cluster is obtained by merging the closest (most similar) clusters into one. The clusters produced by the three algorithms are then used as input for rainfall forecasting. The steps involved in the clustering process are explained in detail in the methodology section. The paper then describes the experiments conducted on the clusters produced by the three algorithms. Clustering performance is measured by the root mean square (RMS) error and the correlation coefficient obtained in each experiment. The results show that the best rainfall forecasts are obtained using the Complete Link algorithm.
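The merge loop described above can be sketched as follows; this is a minimal, quadratic-time illustration with complete-link distance (the farthest pair across two clusters), and the single- and average-link variants differ only in the linkage function.

```python
import math

def complete_link(c1, c2, points):
    # Complete-link distance: the farthest pair across the two clusters.
    return max(math.dist(points[i], points[j]) for i in c1 for j in c2)

def agglomerative(points, n_clusters, linkage):
    # Start with singleton clusters and repeatedly merge the pair of
    # clusters with the smallest linkage distance.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]], points),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6)]
print(agglomerative(pts, 2, complete_link))  # [[0, 1], [2, 3]]
```

Swapping `max` for `min` in the linkage yields Single Link, and replacing it with the mean of the pairwise distances yields Average Link.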

    Selection of defuzzification method to obtain crisp value for representing uncertain data in a modified sweep algorithm

    We present a study of using fuzzy-based parameters for solving a public bus routing problem where demand is uncertain. The fuzzy-based parameters are designed to provide the data required by the route selection procedure. The uncertain data are represented as linguistic values which depend fully on the user's preferences. This paper focuses on selecting the most appropriate defuzzification method for obtaining crisp values that represent uncertain data. We also present a step-by-step evaluation showing that the fuzzy-based parameters are capable of representing uncertain data, replacing the exact data that common route selection algorithms usually require.
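Two standard defuzzification candidates such a selection study would compare are centre of area (centroid) and mean of maxima; the sketch below uses their textbook definitions on a discretized membership function and is not tied to the paper's specific parameters.

```python
def centroid(xs, mu):
    # Centre of area: average of the domain weighted by membership degree.
    return sum(x * m for x, m in zip(xs, mu)) / sum(mu)

def mean_of_maxima(xs, mu):
    # Average of the domain values where membership is maximal.
    peak = max(mu)
    maxima = [x for x, m in zip(xs, mu) if m == peak]
    return sum(maxima) / len(maxima)

xs = [0, 1, 2, 3, 4]                 # discretized domain, e.g. demand level
print(centroid(xs, [0, 1, 1, 0.5, 0]))        # 1.8
print(mean_of_maxima(xs, [0, 1, 1, 0.5, 0]))  # 1.5
```

On a symmetric membership function the two methods agree; on skewed ones, as above, the centroid is pulled toward the heavier tail, which is exactly the kind of behaviour such a comparison has to weigh.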

    Clustering pests of rice using self organizing map

    Rice, Oryza sativa, also called paddy rice or common rice, is grown in both lowland and upland conditions. This food grain is produced in at least 95 countries around the globe; China produced 36% of the world's output in 1999, followed by India at 21%, Indonesia at 8%, and Bangladesh and Vietnam at about 5% each. The United States produced about 1.5% of the world's rice but accounts for about 15% of annual world rice exports. Modern agriculture is shaped both by pressure for increased productivity and by increased stresses caused by plant pests. Geographical Information Systems and Global Positioning Systems are currently used for variable-rate application of pesticides, herbicides, and fertilizers in precision agriculture, but the comparatively less used tools of neural networks can add value to integrated pest management practices. This study details spatial analysis and clustering using a neural network based on the Kohonen Self Organizing Map (SOM), as applied to integrated rice pest management in Malaysia.

    An optimized self organizing map for cluster ambiguity detection

    The Self Organizing Map (SOM), proposed by T. Kohonen (1982), has been widely used in industrial applications such as pattern recognition, biological modelling, data compression, signal processing, and data mining (T. Kohonen, 1997; M.N.M. Sap and E. Mohebi, 2008a, 2008b, 2008c). It is an unsupervised and nonparametric neural network approach. The success of the SOM algorithm lies in its simplicity, which makes it easy to understand, simulate, and use in many applications. The basic SOM consists of neurons, usually arranged in a two-dimensional structure with neighbourhood relations among them. After training, each neuron is attached to a feature vector of the same dimension as the input space. By assigning each input vector to the neuron with the nearest feature vector, the SOM divides the input space into regions (clusters) with common nearest feature vectors. This process can be considered vector quantization (VQ) (R.M. Gray, 1984). In addition, because of the neighbourhood relations contributed by the inter-connections among neurons, the SOM exhibits another important property: topology preservation. Clustering algorithms attempt to organize unlabeled input vectors into clusters such that points within a cluster are more similar to each other than to vectors belonging to different clusters (N.R. Pal, et al., 1993). Clustering methods are of five types: hierarchical, partitioning, density-based, grid-based, and model-based (J. Han and M. Kamber, 2000). Rough set theory employs two thresholds, upper and lower, in the clustering process, which results in rough clusters. The technique can also operate incrementally, i.e. the number of clusters is not predefined by the user. In this chapter, a new two-level clustering algorithm is proposed.
The idea is that the first level trains the data with the SOM neural network, and the second level applies a rough set based incremental clustering approach (S. Ashraf, et al., 2006) to the output of the SOM, requiring only a single scan of the neurons. The optimal number of clusters is found by rough set theory, which groups the given neurons (and, accordingly, the mapped data) into a set of overlapping clusters. The overlapped neurons are then assigned to the clusters they truly belong to by applying a simulated annealing algorithm, which minimizes the uncertainty arising from the clustering operations. In our previous work (M.N.M. Sap and E. Mohebi, 2008a), the hybrid SOM and rough set approach was applied only to catch the overlapped data; the experimental results show that the proposed algorithm (SA-Rough SOM) outperforms it. This chapter is organized as follows. Section 2 outlines the basics of the SOM algorithm. Section 3 describes incremental clustering and rough set theory. Section 4 describes the essence of simulated annealing. Section 5 presents the proposed algorithm. Section 6 is dedicated to the experimental results, Section 7 provides a brief conclusion and future work, and Section 8 gives a chapter summary.
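The first level of the two-level algorithm, SOM training, can be sketched minimally as below. This is an illustrative 1-D SOM with assumed linear decay schedules; the rough set and simulated annealing stages of SA-Rough SOM are not shown.

```python
import math
import random

def train_som(data, n_neurons, epochs=50, lr0=0.5, seed=0):
    # Minimal 1-D SOM: a chain of neurons trained with a Gaussian
    # neighbourhood whose radius and learning rate decay linearly.
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    radius0 = n_neurons / 2.0
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in data:
            frac = 1.0 - t / steps
            lr, radius = lr0 * frac, radius0 * frac + 1e-9
            # Best-matching unit: the neuron nearest to the input vector.
            bmu = min(range(n_neurons), key=lambda k: math.dist(weights[k], x))
            # Pull every neuron toward x, weighted by grid distance to the BMU.
            for k in range(n_neurons):
                h = math.exp(-((k - bmu) ** 2) / (2 * radius ** 2))
                weights[k] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[k], x)]
            t += 1
    return weights

def bmu_index(weights, x):
    # Cluster assignment after training: index of the nearest neuron.
    return min(range(len(weights)), key=lambda k: math.dist(weights[k], x))
```

In the chapter's scheme, the trained neurons (rather than the raw data) are what the rough set incremental clustering then scans, which is why only a single pass over the neurons is needed.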